2-d Processing of Speech with Application to Pitch Estimation
Abstract
In this paper, we introduce a new approach to two-dimensional (2-D) processing of the one-dimensional (1-D) speech signal in the time-frequency plane. Specifically, we obtain the short-space 2-D Fourier transform magnitude of a narrowband spectrogram of the signal and show that this 2-D transformation maps harmonically related signal components to a concentrated entity in the new 2-D plane. We refer to this series of operations as the “grating compression transform” (GCT), a name that reflects how sine-wave grating patterns in the spectrogram are reduced to smeared impulses. The GCT forms the basis of a speech pitch estimator that uses the radial distance to the largest peak in the GCT plane. Using an average magnitude difference between pitch-contour estimates, the GCT-based pitch estimator is shown to compare favorably to a sine-wave-based pitch estimator for all-voiced speech in additive white noise. An extension to a basis for two-speaker pitch estimation is also proposed.
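The chain of operations described in the abstract (narrowband spectrogram, short-space 2-D Fourier transform magnitude, radial distance to the largest peak) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `narrowband_spectrogram` and `gct_pitch`, the 40 ms Hamming window, the patch size, the zero-padding, the 60 to 400 Hz search range, and the axis scaling (cycles per bin on both axes) are all assumptions made for illustration, and the input is a synthetic steady-pitch harmonic signal rather than speech.

```python
import numpy as np

def narrowband_spectrogram(x, fs, win_ms=40.0, hop_ms=10.0, nfft=1024):
    """Magnitude STFT with a long (narrowband) Hamming window.
    Returns an array of shape (nfft // 2 + 1, n_frames)."""
    win_len = int(round(fs * win_ms / 1000))
    hop = int(round(fs * hop_ms / 1000))
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, n=nfft, axis=0))

def gct_pitch(patch, bin_hz, f_min=60.0, f_max=400.0, pad=(2048, 64)):
    """Estimate pitch from one spectrogram patch (frequency bins x frames).
    Assumed scaling: both axes in cycles per bin, so a harmonic spacing of
    f0 Hz appears at radius bin_hz / f0 when the pitch is steady."""
    patch = patch - patch.mean()                      # suppress the DC term
    w2d = np.outer(np.hamming(patch.shape[0]), np.hamming(patch.shape[1]))
    gct = np.abs(np.fft.fft2(patch * w2d, s=pad))     # short-space 2-D FT magnitude
    u = np.fft.fftfreq(pad[0])[:, None]               # cycles per frequency bin
    v = np.fft.fftfreq(pad[1])[None, :]               # cycles per frame
    radius = np.hypot(u, v)
    # Search only radii that correspond to the assumed pitch range.
    mask = (radius >= bin_hz / f_max) & (radius <= bin_hz / f_min)
    peak = np.unravel_index(np.argmax(np.where(mask, gct, 0.0)), gct.shape)
    return bin_hz / radius[peak]

# Synthetic test signal: 19 equal-amplitude harmonics of 200 Hz (hypothetical data).
fs, f0_true, nfft = 8000, 200.0, 1024
t = np.arange(int(0.3 * fs)) / fs
x = sum(np.cos(2 * np.pi * k * f0_true * t) for k in range(1, 20))

spec = narrowband_spectrogram(x, fs, nfft=nfft)
patch = spec[:256, :20]                               # roughly 0-2 kHz, 20 frames
estimate = gct_pitch(patch, bin_hz=fs / nfft)
print(f"estimated pitch: {estimate:.1f} Hz")
```

For a steady pitch the harmonic lines are horizontal, so the peak lies on the frequency axis of the transformed plane and the time-axis scaling does not affect the radial distance. For changing pitch the grating is tilted, and the relation between peak position and pitch would need the treatment given in the paper itself.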
Similar resources
Low-Complexity Pitch Estimation Based on Phase Differences Between Low-Resolution Spectra
Detection of voiced speech and estimation of the pitch frequency are important tasks for many speech processing algorithms. Pitch information can be used, e.g., to reconstruct voiced speech corrupted by noise. In automotive environments, driving noise especially affects voiced speech portions in the lower frequencies. Pitch estimation is therefore important, e.g., for in-car-communication syste...
A Pitch Detection Algorithm Based on Special Points and Area
Pitch detection and estimation is a very important problem in speech signal processing. A simple and effective method for pitch detection has recently been presented. It reduces the computational burden but still has some defects in practical application. Here we improve this simple algorithm and incorporate a method based on positive-negative area for pitch detection. Its go...
2-d Processing of Speech for Multi-pitch Analysis
This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are analyzed to extract pi...
Empirical Mode Decomposition for Advanced Speech Signal Processing
Empirical mode decomposition (EMD) is a newly developed tool for analyzing nonlinear and non-stationary signals. It decomposes any signal into a finite number of time-varying subband signals termed intrinsic mode functions (IMFs). Such data-adaptive decomposition has recently been used in speech enhancement. This study presents the concept of EMD and its application to advanced speech signa...
Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics
Multi-pitch estimation of co-channel speech is especially challenging when the underlying pitch tracks are close in pitch value (e.g., when pitch tracks cross). Building on our previous work in [1], we demonstrate the utility of a two-dimensional (2-D) analysis method of speech for this problem by exploiting its joint representation of pitch and pitch-derivative information from distinct speake...